Innovations in Czech audio-visual speech synthesis for precise articulation
Authors
Abstract
This paper presents new steps toward the animation of precise articulation. An audio-visual corpus for Czech was acquired and a new method for the parameterization of visual speech was designed in order to obtain exact speech data. The parameterization method is primarily suitable for training data-driven visual speech synthesis systems. The audio-visual corpus also includes a specially designed test part. Furthermore, the paper presents a collection of text material suitable for testing visual speech perception, together with a procedure for carrying out the test. The synthesis method, based on visual unit selection and an animation model of a talking head, is extended, and the resulting synthesis system is evaluated both objectively and subjectively.
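As an illustration of what a geometric parameterization of visual speech can look like in general, here is a minimal sketch assuming tracked 2D lip landmarks; the landmark indices and the parameter set are illustrative assumptions, not the method proposed in the paper:

```python
import numpy as np

def lip_parameters(landmarks):
    """Compute simple geometric visual-speech parameters from 2D lip landmarks.

    `landmarks` is an (N, 2) array of lip points for one video frame; the
    indices used below (lip corners and mid-lip points) are illustrative
    assumptions, not the landmark scheme used in the paper.
    """
    left_corner, right_corner = landmarks[0], landmarks[6]
    upper_mid, lower_mid = landmarks[3], landmarks[9]

    width = np.linalg.norm(right_corner - left_corner)   # horizontal mouth opening
    height = np.linalg.norm(lower_mid - upper_mid)       # vertical mouth opening
    rounding = height / (width + 1e-9)                   # crude lip-rounding ratio

    return np.array([width, height, rounding])

def parameterize_sequence(landmark_sequence):
    """Stack per-frame parameters into a (T, 3) visual-speech trajectory."""
    return np.vstack([lip_parameters(f) for f in landmark_sequence])

# toy example: 2 frames, 12 lip landmarks each
frames = np.random.rand(2, 12, 2)
print(parameterize_sequence(frames).shape)  # (2, 3)
```

A data-driven synthesis system of the kind mentioned above would then be trained on such per-frame parameter trajectories rather than on raw video.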
Similar resources
Czech Audio-Visual Speech Synthesis with an HMM-trained Speech Database and Enhanced Coarticulation
The task of visual speech synthesis is usually solved by concatenating basic speech units selected from a visual speech database. The acoustic part is carried out separately using a similar method. There are two main problems in this process. The first is the design of the database, that is, the estimation of the database parameters for all basic speech units. The second is how to co...
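Concatenative selection of this kind is commonly cast as a search for the unit sequence that minimizes a combined target cost and concatenation (join) cost. The sketch below is a generic dynamic-programming (Viterbi-style) formulation under that assumption; the cost functions, weighting, and flat database layout are hypothetical, not those of the cited work:

```python
import numpy as np

def select_units(targets, database, w_join=1.0):
    """Pick one database unit per target slot by minimizing
    target cost + join cost with a Viterbi-style search.

    targets:  (T, D) desired visual parameters per unit slot
    database: (N, D) candidate units (illustrative flat layout)
    """
    T, N = len(targets), len(database)
    # target cost: distance between desired and candidate parameters
    tc = np.linalg.norm(targets[:, None, :] - database[None, :, :], axis=-1)
    # join cost: discontinuity between consecutive candidate units
    jc = np.linalg.norm(database[:, None, :] - database[None, :, :], axis=-1)

    cost = np.full((T, N), np.inf)
    back = np.zeros((T, N), dtype=int)
    cost[0] = tc[0]
    for t in range(1, T):
        # total[i, j] = best cost ending in unit i, then joining to unit j
        total = cost[t - 1][:, None] + w_join * jc + tc[t][None, :]
        back[t] = np.argmin(total, axis=0)
        cost[t] = total[back[t], np.arange(N)]

    # backtrack the cheapest unit sequence
    path = [int(np.argmin(cost[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

targets = np.random.rand(5, 3)
database = np.random.rand(20, 3)
print(select_units(targets, database))
```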
Realistic Face Animation for a Czech Talking Head
This paper focuses on improving visual Czech speech synthesis. Our aim was to design a highly natural and realistic talking head with a realistic 3D face model, improved co-articulation, and a realistic model of the inner articulatory organs (the teeth, the tongue, and the palate). Besides very good articulation, our aim was also the expression of the facial expressions and emotions of the talking head. The intell...
Using Dominance Functions and Audio-Visual Speech Synthesis
This paper presents results of training coarticulation models for Czech audio-visual speech synthesis. Two approaches to coarticulation in audio-visual speech synthesis were used: coarticulation based on dominance functions and visual unit selection. Coarticulation models were trained for both approaches. The models for the unit selection approach were trained on visually clustered data...
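The dominance-function approach is typically formulated in the style of Cohen and Massaro: each speech segment pulls a visual parameter toward its target with a weight that decays exponentially in time, and the resulting track is the dominance-weighted average of all targets. The following is a minimal sketch of that general formulation; the parameter values and function shapes are illustrative, not the trained models from the cited paper:

```python
import numpy as np

def dominance(t, center, alpha=1.0, theta=1.0):
    """Exponentially decaying dominance of a segment centered at `center`
    (Cohen-Massaro-style shape; alpha and theta are illustrative values)."""
    return alpha * np.exp(-theta * np.abs(t - center))

def blend_track(t, segments):
    """Dominance-weighted average of segment targets at times `t`.

    segments: list of (center_time, target_value, alpha, theta) tuples.
    """
    num = np.zeros_like(t, dtype=float)
    den = np.zeros_like(t, dtype=float)
    for center, target, alpha, theta in segments:
        d = dominance(t, center, alpha, theta)
        num += d * target
        den += d
    return num / den

# toy example: three phone targets for one lip parameter
t = np.linspace(0.0, 0.6, 7)
segments = [(0.1, 0.2, 1.0, 20.0), (0.3, 0.9, 1.0, 20.0), (0.5, 0.4, 1.0, 20.0)]
print(np.round(blend_track(t, segments), 3))
```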
Modeling Co-articulation in Text-to-Audio Visual Speech
This paper presents our approach to co-articulation for a text-to-audiovisual speech synthesizer (TTAVS), a system for converting input text into a video-realistic audio-visual sequence. It is an image-based system that models the face using a set of images of a human subject. A concatenation of visemes (the lip shapes corresponding to phonemes) can be used for modeling visual speech. However, in...
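A minimal sketch of the viseme-concatenation idea, assuming a hypothetical many-to-one phoneme-to-viseme map and per-viseme frame sequences; the grouping and data layout below are illustrative, not the TTAVS system's actual inventory:

```python
# Map phonemes to viseme classes (many-to-one; this particular grouping
# is a common illustrative choice, not the cited system's inventory).
PHONEME_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "a": "open", "o": "rounded", "u": "rounded",
}

def visemes_for(phonemes):
    """Convert a phoneme string into the viseme sequence to be rendered."""
    return [PHONEME_TO_VISEME.get(p, "neutral") for p in phonemes]

def concatenate(viseme_clips, visemes):
    """Concatenate per-viseme image sequences (lists of frames) in order.

    Raw concatenation gives hard transitions between clips; a real system
    would blend neighbouring clips to approximate co-articulation.
    """
    frames = []
    for v in visemes:
        frames.extend(viseme_clips[v])
    return frames

# toy example with frame labels standing in for images
clips = {"bilabial": ["B1", "B2"], "open": ["A1"], "rounded": ["O1", "O2"],
         "labiodental": ["F1"], "neutral": ["N1"]}
print(concatenate(clips, visemes_for("mapu")))
```

The hard cuts produced by this naive concatenation are exactly the kind of transition problem that co-articulation modeling, as discussed in the abstract, is meant to address.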
Czech audio-visual speech corpus of a car driver for in-vehicle audio-visual speech recognition
This paper presents the design of an audio-visual speech corpus for in-vehicle audio-visual speech recognition. Several audio-visual speech corpora exist throughout the world, as do several (audio-only) speech corpora for in-vehicle recognition. So far, however, we have not found an audio-visual speech corpus for in-vehicle speech recognition, nor have we found any audio-visual speec...